{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 1. Visualization, Part 3: Getting data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook examines how to get data into your program so that you can visualize it." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#Table of Contents\n", "* [1. Visualization, Part 3: Getting data](#1.-Visualization,-Part-3:-Getting-data)\n", "\t* [1.1 Direct](#1.1-Direct)\n", "\t* [1.2 Construct a CSV file](#1.2-Construct-a-CSV-file)\n", "\t* [1.3 Download CSV](#1.3-Download--CSV)\n", "\t* [1.4 Latitude, Longitude data in CSV: Plotting points on map](#1.4-Latitude,-Longitude-data-in-CSV:-Plotting-points-on-map)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1.1 Direct" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The most straightforward approach, if a bit time-consuming, is to enter the data directly into arrays in your program. This program contains two arrays, `states` and `pop`:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "String [] states = new String[50];\n", "int [] pop = new int[50];\n", "\n", "states[0] = \"AL\";\n", "states[1] = \"AK\";\n", "states[2] = \"AZ\";\n", "states[3] = \"AR\";\n", "states[4] = \"CA\";\n", "states[5] = \"CO\";\n", "states[6] = \"CT\";\n", "states[7] = \"DE\";\n", "states[8] = \"FL\";\n", "states[9] = \"GA\";\n", "states[10] = \"HI\";\n", "states[11] = \"ID\";\n", "states[12] = \"IL\";\n", "states[13] = \"IN\";\n", "states[14] = \"IA\";\n", "states[15] = \"KS\";\n", "states[16] = \"KY\";\n", "states[17] = \"LA\";\n", "states[18] = \"ME\";\n", "states[19] = \"MD\";\n", "states[20] = \"MA\";\n", "states[21] = \"MI\";\n", "states[22] = \"MN\";\n", "states[23] = \"MS\";\n", "states[24] = \"MO\";\n", "states[25] = \"MT\";\n", "states[26] = \"NE\";\n", "states[27] = \"NV\";\n", "states[28] = \"NH\";\n", "states[29] = \"NJ\";\n", "states[30] = \"NM\";\n", "states[31] = \"NY\";\n", "states[32] = \"NC\";\n", "states[33] = \"ND\";\n", "states[34] = \"OH\";\n", "states[35] = \"OK\";\n", "states[36] = \"OR\";\n", "states[37] = \"PA\";\n", "states[38] = \"RI\";\n", "states[39] = \"SC\";\n", "states[40] = \"SD\";\n", "states[41] = \"TN\";\n", "states[42] = \"TX\";\n", "states[43] = \"UT\";\n", "states[44] = \"VT\";\n", "states[45] = \"VA\";\n", "states[46] = \"WA\";\n", "states[47] = \"WV\";\n", "states[48] = \"WI\";\n", "states[49] = \"WY\";\n", "\n", "pop[0] = 4708708;\n", "pop[1] = 698473;\n", "pop[2] = 6595778;\n", "pop[3] = 2889450;\n", "pop[4] = 36961664;\n", "pop[5] = 5024748;\n", "pop[6] = 3518288;\n", "pop[7] = 885122;\n", "pop[8] = 18537969;\n", "pop[9] = 9829211;\n", "pop[10] = 1295178;\n", "pop[11] = 1545801;\n", "pop[12] = 12910409;\n", "pop[13] = 6423113;\n", "pop[14] = 3007856;\n", "pop[15] = 2818747;\n", "pop[16] = 4314113;\n", "pop[17] = 4492076;\n", "pop[18] = 1318301;\n", "pop[19] = 5699478;\n", "pop[20] = 6593587;\n", "pop[21] = 9969727;\n", 
"pop[22] = 5266214;\n", "pop[23] = 2951996;\n", "pop[24] = 5987580;\n", "pop[25] = 974989;\n", "pop[26] = 1796619;\n", "pop[27] = 2643085;\n", "pop[28] = 1324575;\n", "pop[29] = 8707739;\n", "pop[30] = 2009671;\n", "pop[31] = 19541453;\n", "pop[32] = 9380884;\n", "pop[33] = 646844;\n", "pop[34] = 11542645;\n", "pop[35] = 3687050;\n", "pop[36] = 3825657;\n", "pop[37] = 12604767;\n", "pop[38] = 1053209;\n", "pop[39] = 4561242;\n", "pop[40] = 812383;\n", "pop[41] = 6296254;\n", "pop[42] = 24782302;\n", "pop[43] = 2784572;\n", "pop[44] = 621760;\n", "pop[45] = 7882590;\n", "pop[46] = 6664195;\n", "pop[47] = 1819777;\n", "pop[48] = 5654774;\n", "pop[49] = 544270;\n", "\n", "PShape usa;\n", "\n", "void setup() {\n", " size(959, 593); \n", " usa = loadShape(\"usa-wikipedia.svg\");\n", "}\n", "\n", "void draw() {\n", " background(255);\n", " shape(usa, 0, 0);\n", " // Find the population range once, outside the loop:\n", " int minPop = min(pop);\n", " int maxPop = max(pop);\n", " for (int i = 0; i < 50; i++) { \n", " PShape st = usa.getChild(states[i]);\n", " st.disableStyle();\n", " // Scale the population to a value between 0 and 255 (as a float, to avoid integer overflow):\n", " float c = 255.0 * (pop[i] - minPop) / (maxPop - minPop);\n", " // Make the colors go from red (least populous) to blue (most populous):\n", " fill(255 - c, 0, c);\n", " shape(st, 0, 0); \n", " }\n", " noLoop();\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1.2 Construct a CSV file" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another method is to construct a CSV file in the notebook using the `%%file` magic. `%%file` takes the name of the file, followed by the contents of the file. 
The first line of the file is the \"header\" that provides the names of the columns:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Created file '/home/dblank/Public/CS110 Intro to Computing/2015/Lectures/test.csv'.\n" ] } ], "source": [ "%%file test.csv\n", "\"State\",\"Population\"\n", "AL,4708708\n", "AK,698473\n", "AZ,6595778\n", "AR,2889450\n", "CA,36961664\n", "CO,5024748\n", "CT,3518288\n", "DE,885122\n", "FL,18537969\n", "GA,9829211\n", "HI,1295178\n", "ID,1545801\n", "IL,12910409\n", "IN,6423113\n", "IA,3007856\n", "KS,2818747\n", "KY,4314113\n", "LA,4492076\n", "ME,1318301\n", "MD,5699478\n", "MA,6593587\n", "MI,9969727\n", "MN,5266214\n", "MS,2951996\n", "MO,5987580\n", "MT,974989\n", "NE,1796619\n", "NV,2643085\n", "NH,1324575\n", "NJ,8707739\n", "NM,2009671\n", "NY,19541453\n", "NC,9380884\n", "ND,646844\n", "OH,11542645\n", "OK,3687050\n", "OR,3825657\n", "PA,12604767\n", "RI,1053209\n", "SC,4561242\n", "SD,812383\n", "TN,6296254\n", "TX,24782302\n", "UT,2784572\n", "VT,621760\n", "VA,7882590\n", "WA,6664195\n", "WV,1819777\n", "WI,5654774\n", "WY,544270" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can then load the CSV file using an `import` statement and the `loadTable()` function. A few things to note:\n", "\n", "* You need `import processing.table.*;`\n", "* Use `loadTable(FILENAME, \"header\");` to load the data\n", "* Use `table.getRowCount()` to get the number of data rows; the header row is not counted\n", "* Use the for-each form of `for` to loop through the rows, like: `for (TableRow row : table.rows()) { ... }`\n", "* Use `row.getString(COLUMN_NAME)`, `row.getInt(COLUMN_NAME)`, or `row.getFloat(COLUMN_NAME)` to get the value in that column of the row" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import processing.table.*;\n", "\n", "PShape usa;\n", "Table table;\n", "\n", "void setup() {\n", " table = loadTable(\"test.csv\", \"header\");\n", " println(table.getRowCount() + \" total rows in table\"); \n", " size(959, 593); \n", " usa = loadShape(\"usa-wikipedia.svg\");\n", "}\n", "\n", "// Return the largest value in the Population column:\n", "int findMax() {\n", " int retval = 0;\n", " for (TableRow row : table.rows()) {\n", " int pop = row.getInt(\"Population\");\n", " if (pop > retval)\n", " retval = pop;\n", " }\n", " return retval;\n", "}\n", "\n", "void draw() {\n", " background(255);\n", " shape(usa, 0, 0);\n", " int maxPop = findMax();\n", " for (TableRow row : table.rows()) {\n", " String state = row.getString(\"State\");\n", " int pop = row.getInt(\"Population\");\n", " PShape st = usa.getChild(state);\n", " st.disableStyle();\n", " // Portion of 255, computed as a float to avoid integer overflow:\n", " float c = 255.0 * pop / maxPop;\n", " // From red to black:\n", " fill(255 - c, 0, 0);\n", " shape(st, 0, 0); \n", " }\n", " noLoop();\n", "}" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import processing.table.*;\n", "\n", "Table table;\n", "\n", "void setup() {\n", " size(500, 500);\n", " table = loadTable(\"test.csv\", \"header\");\n", "}\n", "\n", "long findSum() {\n", " long retval = 0;\n", " for (TableRow row : table.rows()) {\n", " int pop = row.getInt(\"Population\");\n", " retval += pop;\n", " }\n", " return retval;\n", "}\n", "\n", "int findMax() {\n", " int mmax = 0;\n", " for (TableRow row : table.rows()) {\n", " int pop = row.getInt(\"Population\");\n", " if (pop > mmax)\n", " mmax = pop;\n", " }\n", " return mmax;\n", "}\n", "\n", "int findMin() {\n", " int mmin = 2147483647; // largest value an int can hold\n", " for (TableRow row : table.rows()) {\n", " int pop = row.getInt(\"Population\");\n", " if (pop < mmin)\n", " mmin = pop;\n", " }\n", " return mmin;\n", "}\n", "\n", "\n", "// Return the point that is length away from (x, y) in the direction angle (in radians):\n", "float[] rotateAround(float x, float y, float length, float angle) {\n", " return new float[] {x + length * cos(angle),\n", " y - length * sin(angle)};\n", "}\n", "\n", "void draw() {\n", " background(255);\n", " float sum = findSum();\n", " // Find the population range once, outside the loops:\n", " int minPop = findMin();\n", " int maxPop = findMax();\n", " // First, draw the pies:\n", " float start = 0;\n", " for (TableRow row : table.rows()) {\n", " int pop = row.getInt(\"Population\");\n", " // Angle of this slice in radians, proportional to the population:\n", " float angle = 2 * PI * pop / sum;\n", " fill(255 - 255.0 * (pop - minPop) / (maxPop - minPop), 0, 0);\n", " arc(width/2, height/2, width, height, start, start + angle);\n", " start += angle; \n", " }\n", " // Next, draw the labels:\n", " start = 0;\n", " for (TableRow row : table.rows()) {\n", " String state = row.getString(\"State\");\n", " int pop = row.getInt(\"Population\");\n", " float angle = 2 * PI * pop / sum;\n", " // Place the label at the middle of the slice:\n", " float[] xy = rotateAround(width/2, height/2, height/2.2, -(start + angle/2));\n", " // Draw shadow, and the text:\n", " fill(0);\n", " text(state, xy[0] + 1, xy[1] + 1);\n", " fill(255);\n", " text(state, xy[0], xy[1]);\n", " // Increment by the size of the pie:\n", " start += angle; \n", " }\n", " noLoop();\n", "}" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "## 1.3 Download CSV" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, you can find a CSV file (or make one in Excel or another spreadsheet program) and either upload it to your notebook server or download it right here from the Internet:" ] }, { "cell_type": "code", "execution_count": 65, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Downloaded 'mammals.csv'.\n" ] } ], "source": [ "%download http://www.departments.bucknell.edu/biology/resources/msw3/export.asp -f mammals.csv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This takes the first 101 lines (the header row plus the first 100 records) and stores them in a file called `mammals_100.csv`." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": true }, "outputs": [], "source": [ "!head -101 mammals.csv > mammals_100.csv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This numbers every line of the file and shows only the last one, which tells us how many lines the file has:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 13583\t\"14300175\",\"CETACEA\",\"ODONTOCETI\",\"\",\"\",\"Ziphiidae\",\"\",\"\",\"Ziphius\",\"\",\"cavirostris\",\"\",\"SPECIES\",\"False\",\"\",\"YES\",\"G. Cuvier\",\"1823\",\"\",\"Rech. Oss. Foss., Nouv. ed.\",\"5\",\"1\",\"350\",\"\",\"\",\"Cuviers Beaked Whale\",\"France, \"\"dans le dpartement des Bouches-du-Rhne, entre de Fos et l'embouchure du Galgeon\"\" (= between Fos and the mouth of the Galgeon River).\",\"Worldwide: cold-temperate to tropical waters.\",\"CITES Appendix II; IUCN Data Deficient.\",\"australis (Burmeister, 1865); capensis (Gray, 1865); chathamensis (Hector, 1873); indicus Van Beneden, 1863.\",\"Reviewed by Heyning (1989b).\",\"43\",\"43-00175\",\"43-0001-0034-0000-0000-0146-0000-0000-0174-0000-0175\"\r", "\r\n", "\n" ] } ], "source": [ "! 
cat -n mammals.csv | tail -1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This file has 13,583 lines.\n", "\n", "This command lists the first line of the file to see the headers:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\"ID\",\"Order\",\"Suborder\",\"Infraorder\",\"Superfamily\",\"Family\",\"Subfamily\",\"Tribe\",\"Genus\",\"Subgenus\",\"Species\",\"Subspecies\",\"TaxonLevel\",\"Extinct?\",\"OriginalName\",\"ValidName\",\"Author\",\"Date\",\"ActualDate\",\"CitationName\",\"CitationVolume\",\"CitationIssue\",\"CitationPages\",\"CitationType\",\"TypeSpecies\",\"CommonName\",\"TypeLocality\",\"Distribution\",\"Status\",\"Synonyms\",\"Comments\",\"File\",\"SortOrder\",\"DisplayOrder\"\r", "\r\n", "\n" ] } ], "source": [ "!head -1 mammals_100.csv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we load the file and print out one column:" ] }, { "cell_type": "code", "execution_count": 71, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import processing.table.*;\n", "\n", "Table table;\n", "\n", "void setup() {\n", " table = loadTable(\"mammals_100.csv\", \"header\");\n", " println(table.getRowCount() + \" total rows in table\"); \n", " size(959, 593); \n", "}\n", "\n", "void draw() {\n", " background(255);\n", " // Print the Species column of every row:\n", " for (TableRow row : table.rows()) {\n", " String species = row.getString(\"Species\");\n", " println(species);\n", " }\n", " noLoop();\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1.4 Latitude, Longitude data in CSV: Plotting points on map" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For this example, suppose we have latitude/longitude data that we would like to plot at the corresponding locations on the map.\n", "\n", "First, we make a CSV file:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Created file '/home/dblank/Public/CS110 Intro to Computing/2015/Lectures/states.csv'.\n" ] } ], "source": [ "%%file states.csv\n", "\"Latitude\",\"Longitude\",\"City\",\"State\"\n", "33.57,86.75,Birmingham,AL\n", "34.65,86.77,Huntsville,AL\n", "30.68,88.25,Mobile,AL\n", "32.30,86.40,Montgomery,AL\n", "32.34,86.99,Selma,AL\n", "31.87,86.02,Troy,AL\n", "33.23,87.62,Tuscaloosa,AL" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then we read it in, getting the data from the file and displaying it in a visually pleasing way. The longitudes in the file are given as positive degrees west, so the sketch negates them before projecting." ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import processing.table.*;\n", "\n", "PShape usa;\n", "Table table;\n", "\n", "void setup() {\n", " table = loadTable(\"states.csv\", \"header\");\n", " size(959, 593); \n", " usa = loadShape(\"usa-wikipedia.svg\");\n", "}\n", "\n", "// Albers equal-area conic projection; lat and lng are in degrees:\n", "float[] albers(float lat, float lng) {\n", " float lat0 = 23.0 * (PI/180); // Latitude_Of_Origin\n", " float lng0 = -96.0 * (PI/180); // Central_Meridian\n", " float phi1 = 30.0 * (PI/180); // Standard_Parallel_1\n", " float phi2 = 50.0 * (PI/180); // Standard_Parallel_2\n", "\n", " float n = 0.5 * (sin(phi1) + sin(phi2));\n", " float c = cos(phi1);\n", " float C = c * c + 2 * n * sin(phi1);\n", " float p0 = sqrt(C - 2 * n * sin(lat0)) / n;\n", " float theta = n * (lng * PI/180 - lng0);\n", " float p = sqrt(C - 2 * n * sin(lat * PI/180)) / n;\n", " float x = p * sin(theta);\n", " float y = p0 - p * cos(theta);\n", " return new float[] { x, y };\n", "}\n", "\n", "void plot(float lat, float lon, String city, color c, float radius) {\n", " // Values to scale the lat, lon to fit on the USA map:\n", " float xoffset = 485;\n", " float xscale = 1245;\n", " float yoffset = 630; \n", " float yscale = 1250;\n", " \n", " float[] xy = albers(lat, lon);\n", " fill(c);\n", " ellipse(xoffset + xy[0] * xscale, yoffset - xy[1] * yscale, radius, radius);\n", " // Draw the label's shadow, then the label itself:\n", " fill(0);\n", " text(city, xoffset + xy[0] * xscale + 1, yoffset - xy[1] * yscale + 1);\n", " fill(255);\n", " text(city, xoffset + xy[0] * xscale, yoffset - xy[1] * yscale);\n", "}\n", "\n", "void draw() {\n", " background(255);\n", " shape(usa, 0, 0); \n", " for (TableRow row : table.rows()) {\n", " float lat = row.getFloat(\"Latitude\");\n", " float lon = row.getFloat(\"Longitude\");\n", " String city = row.getString(\"City\");\n", " // Negate the longitude because the file stores degrees west as positive numbers:\n", " plot(lat, -lon, city, color(random(255), random(255), random(255), 128), random(5, 20));\n", " }\n", " noLoop();\n", "}" ] } ], "metadata": { "kernelspec": { "display_name": "Calysto Processing", "language": "processing", "name": "calysto_processing" }, "language_info": { "codemirror_mode": { "name": "text/x-java", 
"version": 2 }, "file_extension": ".java", "mimetype": "text/x-java", "name": "java" } }, "nbformat": 4, "nbformat_minor": 0 }